Statistical Curvature and Stochastic Complexity
Authors
Abstract
We discuss the relationship between the statistical embedding curvature [1, 2] and the logarithmic regret [11] (regret for short) of the Bayesian prediction strategy (or coding strategy) for curved exponential families and Markov models. The regret of a strategy is defined as the difference between the logarithmic loss (code length) incurred by the strategy and that of the best strategy for each data sequence among a considered class of prediction strategies. (The considered class is referred to as the reference class.) Since a prediction strategy is equivalent to a probability distribution, a class of prediction strategies is equivalent to a statistical model. Note that the logarithmic loss (equivalently, the code length) of the minimax strategy is equal to Rissanen’s stochastic complexity (SC). SC is a generalization of Minimum Description Length [8, 3] and plays an important role in statistical inference, including model selection, universal prediction, and universal coding. It can be shown that the Bayesian strategy with the Jeffreys prior (Jeffreys strategy for short) asymptotically achieves SC up to the constant term when the reference class is an exponential family [12, 13, 16]. The restriction to exponential families matters because the logarithmic loss of the Bayes mixture strategy is affected by the exponential curvature of the considered class. Hence, the Jeffreys strategy does not achieve SC in general if the reference class is not an exponential family. For the curved exponential family case, in order to obtain the minimax regret, we give a method to modify the Jeffreys mixture by assuming a prior distribution on the exponential family in which the curved family is embedded. We also consider the expected version of regret (known as redundancy in information theory).
When the true probability distribution belongs to the reference class, the Jeffreys strategy asymptotically achieves the minimax redundancy regardless of the curvature of the reference class, as shown by Clarke and Barron [6]. However, if the true probability distribution does not belong to the reference class, the situation differs, and the redundancy of the Jeffreys strategy is affected by both the exponential and mixture curvatures of the reference class. Finally, we study the exponential curvature of a class of Markov sources defined by a context tree (tree model). Tree models are classified into FSMX models and non-FSMX models. It is known that FSMX models are exponential families in an asymptotic sense. We are interested in whether non-FSMX models are exponential families. We show that a certain kind of non-FSMX tree model is curved in terms of exponential curvature.
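As a small numerical illustration of the regret defined in the abstract, the following sketch uses the Bernoulli model, a one-parameter exponential family chosen here only as an example (it is not taken from the paper). It computes the regret of the Jeffreys mixture, that is, the Beta(1/2, 1/2) mixture, on a binary sequence with k ones out of n, and compares it with the minimax regret, computed as the logarithm of the sum of maximized likelihoods over all sequences. All function names are hypothetical, and natural logarithms are used throughout.

```python
import math


def log_max_likelihood(k, n):
    """Log of the maximized Bernoulli likelihood for k ones in n symbols.

    This is the log-probability given by the best strategy in the
    reference class (the Bernoulli model), with the convention 0*log 0 = 0.
    """
    out = 0.0
    if k > 0:
        out += k * math.log(k / n)
    if k < n:
        out += (n - k) * math.log((n - k) / n)
    return out


def log_jeffreys_mixture(k, n):
    """Log-probability of one sequence with k ones under the Jeffreys mixture.

    For the Bernoulli model the Jeffreys prior is Beta(1/2, 1/2), so the
    mixture probability is B(k + 1/2, n - k + 1/2) / B(1/2, 1/2),
    where B(1/2, 1/2) = pi.
    """
    return (math.lgamma(k + 0.5) + math.lgamma(n - k + 0.5)
            - math.lgamma(n + 1.0) - math.log(math.pi))


def jeffreys_regret(k, n):
    """Regret of the Jeffreys strategy on a sequence with k ones out of n."""
    return log_max_likelihood(k, n) - log_jeffreys_mixture(k, n)


def minimax_regret(n):
    """Minimax regret: log of the sum over all sequences of the maximized likelihood."""
    terms = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + log_max_likelihood(k, n)
        for k in range(n + 1)
    ]
    m = max(terms)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(t - m) for t in terms))


if __name__ == "__main__":
    n = 1000
    print("minimax regret          :", minimax_regret(n))
    print("Jeffreys regret, k = n/2:", jeffreys_regret(n // 2, n))
    print("Jeffreys regret, k = 0  :", jeffreys_regret(0, n))
```

In this example the Jeffreys regret for a sequence whose maximum likelihood estimate lies in the interior of the parameter space is close to the minimax value, while sequences at the boundary (k = 0 or k = n) incur a visibly larger regret. The sketch covers only the exponential family case; it does not implement the modified mixture for curved families that the abstract describes.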
Similar resources
Local Minimax Complexity of Stochastic Convex Optimization
We extend the traditional worst-case, minimax analysis of stochastic convex optimization by introducing a localized form of minimax complexity for individual functions. Our main result gives function-specific lower and upper bounds on the number of stochastic subgradient evaluations needed to optimize either the function or its “hardest local alternative” to a given numerical precision. The bou...
Stochastic Bound Majorization
Recently a majorization method for optimizing partition functions of log-linear models was proposed alongside a novel quadratic variational upper-bound. In the batch setting, it outperformed state-of-the-art first- and second-order optimization methods on various learning tasks. We propose a stochastic version of this bound majorization method as well as a low-rank modification for high-dimensiona...
Reconstruction of Surface and Stochastic Dynamics from a Planar Projection of Trajectories
We show how to reconstruct a two-dimensional surface, the drift field, and the diffusion tensor from a planar projection of trajectories of a diffusion process on the surface. The reconstruction is based on the stochastic differential equations of the projected motion, whose drift and diffusion tensor depend on the local curvature of the surface. The reconstruction process requires the solution...
Robust uncapacitated multiple allocation hub location problem under demand uncertainty: minimization of cost deviations
The hub location–allocation problem under uncertainty is a real-world task arising in areas such as public and freight transportation and telecommunication systems. In many applications, the demand is considered inexact because of forecasting inaccuracies or human unpredictability. This study addresses the robust uncapacitated multiple allocation hub location problem with a set of ...
Some Notes on Rissanen's Stochastic Complexity
A new version of stochastic complexity for a parametric statistical model is derived, based on a class of two-part codes. We show that choosing the quantization in the first step according to the Fisher information is optimal, and we compare our approach to a recent result of Rissanen [10]. An application to robust regression model selection is presented.